Computational Nonlinear Stochastic Control
Authors
Abstract
Numerous fields of science and engineering present the problem of uncertainty propagation through nonlinear dynamic systems [1]. One may be interested in the determination of the response of engineering structures, such as beams, plates, or entire buildings, under random excitation (in structural mechanics [2]); the motion of particles under the influence of stochastic force fields (in particle physics [3]); or the computation of the prediction step in the design of a Bayesian filter (in filtering theory [4,5]), among others. All these applications require the study of the time evolution of the probability density function (PDF) p(t, x), corresponding to the state x of the relevant dynamic system. The PDF is given by the solution of the Fokker–Planck–Kolmogorov equation (FPE), a partial differential equation (PDE) in the PDF of the system, defined by the underlying dynamic system's parameters. In this paper, approximate solutions of the FPE are considered and subsequently leveraged for the design of controllers for nonlinear stochastic dynamic systems.

In the past few years, the authors have developed a generalized multiresolution meshless finite element method (FEM) methodology, the partition of unity FEM (PUFEM), which uses the recently developed global-local orthogonal mapping methodology [6] to provide the partition of unity functions and the orthogonal local basis functions for the solution of the FPE. The PUFEM is a Galerkin projection method, and the solution is characterized in terms of a finite dimensional representation of the Fokker–Planck operator underlying the problem. The methodology is also highly amenable to parallelization [7–10]. Though the FPE is invaluable in quantifying the uncertainty evolution through nonlinear systems, perhaps its greatest benefit lies in the stochastic analysis, design, and control of nonlinear systems.

In the context of nonlinear stochastic control, Markov decision processes (MDPs) have long been one of the most widely used frameworks for discrete time stochastic control. However, the dynamic programming (DP) equations underlying MDPs suffer from the curse of dimensionality [11–13]. Various approximate dynamic programming methods have been proposed in the past several years to overcome the curse of dimensionality [13–17]; broadly, they constitute functional reinforcement learning, an essentially model-free approach to approximating the optimal control policy in stochastic optimal control problems. These methods generally fall into three categories: value function approximation methods [13], policy gradient/approximation methods [15,16], and actor-critic methods [14,17]. They attempt to reduce the dimensionality of the DP problem through a compact parametrization of the value function (with respect to a policy, or the optimal value function) and the policy function. The methods differ mainly in the parametrization employed to achieve this goal, ranging from nonlinear function approximators such as neural networks [14] to linear approximation architectures [13,18]. They learn the optimal policies through repeated simulations of a dynamic system and thus can take a long time to converge to a good policy, especially when the problem has continuous state and control spaces.
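For reference, the FPE takes the following standard form for an Itô diffusion; this is the textbook statement, not an equation reproduced from the paper. For dynamics $d\mathbf{x} = \mathbf{f}(\mathbf{x})\,dt + g(\mathbf{x})\,d\mathbf{W}$ with process noise intensity $Q$, the state PDF evolves as

\[
\frac{\partial p(t,\mathbf{x})}{\partial t}
= -\sum_{i} \frac{\partial}{\partial x_i}\bigl[f_i(\mathbf{x})\,p(t,\mathbf{x})\bigr]
+ \frac{1}{2}\sum_{i,j} \frac{\partial^2}{\partial x_i \partial x_j}\Bigl[\bigl(g Q g^{\mathsf T}\bigr)_{ij}\,p(t,\mathbf{x})\Bigr]
\equiv \mathcal{L}_{FP}\, p(t,\mathbf{x}).
\]

A Galerkin projection onto a finite basis, $p \approx \sum_k c_k(t)\,\phi_k(\mathbf{x})$, as in the PUFEM, reduces the PDE to a linear ordinary differential equation $\dot{\mathbf{c}} = A\mathbf{c}$ in the coefficient vector, where $A$ is the finite dimensional representation of $\mathcal{L}_{FP}$ referred to above.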
In contrast, the methodology proposed here is model-based and uses the finite dimensional representation of the underlying parametrized diffusion operator to parametrize both the value function and the control policy in the stochastic optimal control problem. Considering the low-order finite dimensional controlled diffusion operator allows us to significantly reduce the dimensionality of the planning problem, while providing a computationally efficient recursive method for obtaining progressively better control policies.

The literature on computational methods for solving continuous time stochastic control problems is relatively sparse compared with that for the discrete time problem. One approach is through the use of locally consistent Markov decision processes [19]. In this approach, the continuous controlled diffusion operator is approximated by a finite dimensional Markov chain that satisfies certain local consistency conditions, namely that its drift and diffusion coefficients locally match those of the original process. The resulting finite state MDP is solved by standard DP techniques, such as value iteration and policy iteration. The method relies on a finite difference discretization and thus can be computationally very intensive in higher-dimensional spaces. In another approach [20,21], the diffusion process is approximated by a finite dimensional Markov chain through the application of generalized cell-to-cell mapping [22]. However, even this method suffers from the curse of dimensionality, because it involves discretizing the state space into a grid, which becomes increasingly infeasible as the dimension of the system grows. Finite difference and finite element methods have also been applied directly to the nonlinear Hamilton–Jacobi–Bellman (HJB) partial differential equation [23,24]. The method proposed here differs in that it uses policy iteration in the original infinite dimensional function space, along with a finite dimensional representation of the controlled diffusion operator, to solve the problem. Considering a lower-order approximation of the underlying operator results in a significant reduction in the dimensionality of the computational problem. Using the policy iteration algorithm typically results in having to solve a sequence of a few linear equations (typically fewer than five) before practical convergence is obtained, as opposed to solving one high-dimensional nonlinear equation if the original nonlinear HJB equation is attacked directly.

The literature for solving deterministic optimal control problems in continuous time is relatively mature compared with its stochastic counterpart. The method of successive approximations/ …
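To make the policy iteration loop concrete, the following is a minimal sketch of policy iteration on a finite-state MDP of the kind produced by a locally consistent discretization [19]. The transition tensor P, stage cost l, and discount factor are hypothetical stand-ins, not quantities from the paper, and the construction of the chain from a specific diffusion is omitted.

```python
# Minimal policy iteration on a finite-state, finite-action MDP.
# P and l are hypothetical inputs standing in for a locally consistent
# Markov chain approximation of a controlled diffusion.
import numpy as np

def policy_iteration(P, l, gamma=0.99, max_iter=50):
    """P: (A, S, S) transition matrices per action; l: (S, A) stage costs."""
    n_actions, n_states, _ = P.shape
    policy = np.zeros(n_states, dtype=int)          # start from an arbitrary policy
    for _ in range(max_iter):
        # Policy evaluation: solve the linear system (I - gamma * P_pi) V = l_pi
        P_pi = P[policy, np.arange(n_states), :]    # (S, S) rows under current policy
        l_pi = l[np.arange(n_states), policy]
        V = np.linalg.solve(np.eye(n_states) - gamma * P_pi, l_pi)
        # Policy improvement: greedy minimization of the one-step lookahead cost
        Q = l.T + gamma * P @ V                     # (A, S) action values
        new_policy = np.argmin(Q, axis=0)
        if np.array_equal(new_policy, policy):      # policy stable: converged
            break
        policy = new_policy
    return V, policy
```

Each pass through the loop solves one linear system (the policy evaluation step), which mirrors the "sequence of a few linear equations" noted above.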
Similar references
Computational method based on triangular operational matrices for solving nonlinear stochastic differential equations
In this article, a new numerical method based on triangular functions for solving nonlinear stochastic differential equations is presented. To this end, the stochastic operational matrix of triangular functions for the Itô integral is determined. Computation with the presented method is very simple and attractive. In addition, convergence analysis and numerical examples that illustrate accuracy and eff...
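The operational-matrix scheme itself is not reproduced here; as a hedged point of reference for the same class of Itô SDEs, dX = f(X) dt + g(X) dB, the following is a minimal Euler–Maruyama baseline of the kind often used to check the accuracy of such methods. The drift and diffusion functions are illustrative placeholders.

```python
# Euler--Maruyama simulation of a scalar Ito SDE dX = f(X) dt + g(X) dB.
# A reference baseline, not the triangular-function method of the paper.
import numpy as np

def euler_maruyama(f, g, x0, T=1.0, n_steps=1000, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    dt = T / n_steps
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        dB = rng.normal(0.0, np.sqrt(dt))           # Brownian increment
        x[k + 1] = x[k] + f(x[k]) * dt + g(x[k]) * dB
    return x

# Illustrative usage: linear SDE dX = -X dt + 0.5 X dB
path = euler_maruyama(lambda x: -x, lambda x: 0.5 * x, x0=1.0)
```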
Wilson wavelets for solving nonlinear stochastic integral equations
A new computational method based on Wilson wavelets is proposed for solving a class of nonlinear stochastic Itô–Volterra integral equations. To do this, a new stochastic operational matrix of Itô integration for Wilson wavelets is obtained. Block pulse functions (BPFs) and a collocation method are used in the process of forming this matrix. Using these basis functions and their operat...
Computational Nonlinear Stochastic Control based on the Fokker-Planck-Kolmogorov Equation
The optimal control of nonlinear stochastic systems is considered in this paper. The central role played by the Fokker-Planck-Kolmogorov equation in the stochastic control problem is shown under the assumption of asymptotic stability. A computational approach for the problem is devised based on policy iteration/successive approximations, and a finite dimensional approximation of the control pa...
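In continuous time, the successive-approximation loop referred to here is commonly written as follows; this is the standard form for a discounted problem with controlled generator $\mathcal{L}^u$ and stage cost $\ell(x,u)$, stated as background rather than quoted from the paper:

\begin{align*}
\text{(evaluate)} \quad & \mathcal{L}^{u_k} V_k(x) - \beta V_k(x) + \ell\bigl(x, u_k(x)\bigr) = 0,\\
\text{(improve)} \quad & u_{k+1}(x) = \arg\min_{u}\,\bigl[\ell(x,u) + \mathcal{L}^{u} V_k(x)\bigr].
\end{align*}

The evaluation step is linear in $V_k$, which is what makes each iteration cheap relative to solving the full nonlinear HJB equation in one shot.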
Numerical solution of nonlinear SPDEs using a multi-scale method
In this paper we establish a new numerical method for solving a class of stochastic partial differential equations (SPDEs) based on B-spline wavelets. The method combines implicit collocation with the multi-scale method. Using the multi-scale method, SPDEs can be solved on a given subdomain with more accuracy and lower computational cost than on the rest of the domain. The stability and c...
Optimal control of nonlinear dynamic econometric models: An algorithm and an application
OPTCON is an algorithm for the optimal control of nonlinear stochastic systems which is particularly applicable to econometric models. It delivers approximate numerical solutions to optimum control problems with a quadratic objective function for nonlinear econometric models with additive and multiplicative (parameter) uncertainties. The algorithm was programmed in C# and allows for determinist...
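OPTCON itself is not reproduced here; the following is only the generic finite-horizon backward Riccati recursion that quadratic-objective algorithms of this kind solve after linearizing the nonlinear model. The matrices A, B, Q, R, Qf are illustrative placeholders.

```python
# Backward Riccati recursion for x_{t+1} = A x_t + B u_t with quadratic cost
# sum_t (x'Qx + u'Ru) + x_N' Qf x_N; the deterministic LQ core of such methods.
import numpy as np

def lqr_backward_pass(A, B, Q, R, Qf, N):
    """Returns the optimal feedback gains K_t, ordered t = 0 .. N-1."""
    P = Qf
    gains = []
    for _ in range(N):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)   # optimal gain
        P = Q + A.T @ P @ (A - B @ K)                        # Riccati update
        gains.append(K)
    return gains[::-1]
```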
Particle Model Predictive Control: Tractable Stochastic Nonlinear Output-Feedback MPC
We combine conditional state density construction with an extension of the Scenario Approach for stochastic Model Predictive Control to nonlinear systems to yield a novel particle-based formulation of stochastic nonlinear output-feedback Model Predictive Control. Conditional densities given noisy measurement data are propagated via the Particle Filter as an approximate implementation of the Bay...
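The following is a minimal sketch of one bootstrap particle filter step of the kind used to propagate the conditional state density inside such a particle MPC scheme; the scalar model f, measurement map h, and noise levels are illustrative placeholders, not the formulation in the paper.

```python
# One bootstrap particle filter step: propagate, weight by likelihood, resample.
import numpy as np

def particle_filter_step(particles, y, f, h, q_std, r_std, rng):
    # Propagate particles through the (nonlinear) dynamics with process noise
    particles = f(particles) + rng.normal(0.0, q_std, size=particles.shape)
    # Weight by the Gaussian measurement likelihood p(y | x)
    w = np.exp(-0.5 * ((y - h(particles)) / r_std) ** 2)
    w /= w.sum()
    # Multinomial resampling to avoid weight degeneracy
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return particles[idx]
```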